Yoshi Suhara

Celine

Llama-Nemotron: Efficient Reasoning Models

May 02, 2025

When2Call: When (not) to Call Tools

Apr 26, 2025

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Apr 17, 2025

Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning

Apr 15, 2025

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025

Hymba: A Hybrid-head Architecture for Small Language Models

Nov 20, 2024

Your Large Language Models Are Leaving Fingerprints

May 22, 2024

Large Language Models are Inconsistent and Biased Evaluators

May 02, 2024

Source Identification in Abstractive Summarization

Feb 07, 2024

Summarizing Community-based Question-Answer Pairs

Nov 17, 2022